Abstract

Purpose: Epithelial-Mesenchymal Transition (EMT) features appear to be key events in the development and progression of
breast cancer. Epigenetic modifications contribute to the establishment and maintenance of cancer subclasses, as well as to
the EMT process. Whether histone variants contribute to these transformations is not known. We investigated the relative
expression levels of histone macroH2A1 splice variants and correlated them with breast cancer status, prognosis and types.

Methods: To detect differential expression of macroH2A1 variant mRNAs in breast cancer cells and tumor samples, we used
the following databases: GEO, EMBL-EBI and publisher databases (May-August 2012). We extracted macroH2A1.1/
macroH2A1 mRNA ratios and performed correlation studies on intrinsic molecular subclasses of breast cancer and on
molecular characteristics of EMT. Associations between molecular and survival data were determined.

Results: We found increased macroH2A1.1/macroH2A1 mRNA ratios to be associated with the claudin-low intrinsic subtype
in breast cancer cell lines. At the molecular level this association translates into a positive correlation between macroH2A1
ratios and molecular characteristics of the EMT process. Moreover, untreated Triple Negative Breast Cancers presenting a
high macroH2A1.1 mRNA ratio exhibit a poor outcome.

Conclusion: These results provide the first evidence that macroH2A1.1 could be exploited as an actor in the maintenance of a
transient cellular state in EMT progress towards metastatic development of breast tumors.
Introduction

Triple-Negative Breast Cancer (TNBC) is clinically defined by 
the lack of expression of the estrogen (ER) and progesterone (PgR) 
receptor genes, and by the absence of amplification of human 
epidermal growth factor receptor-2 (HER2). Treatment of TNBC 
has been challenging due to its heterogeneity at the molecular level 
and the absence of well-defined molecular targets [1,2]. Despite a 
frequent complete response to neoadjuvant chemotherapy, TNBC 
patients also have a higher rate of long term recurrence and worse 
prognosis than ER-positive BC patients. Distinguishing chemore- 
sistant TNBC patients at risk to relapse from those with a relatively 
favorable prognosis, would help to identify clinically relevant 
subgroups that could benefit from alternative treatments. 

Advances in gene expression profiling have permitted charac- 
terization of different intrinsic molecular subtypes present in 
TNBC [3]. One of these, the claudin-low breast cancer subtype 
[4], is characterized by mesenchymal features, low expression of 
cell-cell junction proteins (i.e., E-cadherin), and intense immune 
infiltrates. Furthermore, claudin-low tumors have unique biological
properties linked to mammary stem cells [5] and Epithelial-
Mesenchymal Transition (EMT) features [6].

Gene expression during EMT depends on specific
transcription factors that interact with enhancer or promoter
elements, and the accessibility of their binding sites is regulated
by epigenetic reprogramming [7,8]. Hence, chromatin reorganization
could contribute to the regulation of epithelial plasticity [9-12].
To date, however, the presence of histone variants has not
been investigated with respect to the phenomenon of EMT. Gene
expression accompanying EMT is also regulated at the post-
transcriptional level via alternative splicing of RNA [13-15].

The histone variant macroH2A1 is a vertebrate-specific
member of the H2A family and is unusual due to the presence
of a C-terminal macro domain [16]. Two isoforms, macroH2A1.1
and macroH2A1.2, are produced by alternative splicing of the
H2AFY gene. Both isoforms have been associated with silencing
and transcriptional repression [17-19]. Regulation of macroH2A1
expression seems to be linked to self-renewal and commitment of
ES cells, representing a barrier to reprogramming pluripotency
[20-22]. In melanoma, loss of macroH2A1 promoted progression
of metastasis [23]. Moreover, high levels of macroH2A1.1 are
associated with slowly proliferating cancers, whereas highly
proliferating tumors have markedly decreased macroH2A1.1
levels. Conversely, macroH2A1.2 expression is independent of
proliferation in all tumours [24-26]. Notably, expression of
macroH2A1.1 has been identified as a novel biomarker in lung
and colon cancer models [25,26].

In this study, we demonstrate that selective splicing of the
H2AFY gene is correlated with EMT features linked to claudin-
low breast cancers. We propose that macroH2A1.1 expression
levels could participate in the epigenetic program linked to the poor
clinical outcome of this molecular breast cancer subtype, and more
generally in the EMT process.

Materials and Methods 

Cell culture 

MCF-7 and MDA-MB231 were obtained from ATCC. ZR-75,
MDA-MB436 and Hs578T were a gift from G. Freiss
(Montpellier, France), originally purchased from ATCC [27].
MDA-MB231, MDA-MB436 and Hs578T cells were maintained
in DMEM high glucose with Glutamax. MCF-7 cells were
maintained in DMEM/F12 with Glutamax. ZR-75 cells were
maintained in RPMI-1640 supplemented with 10 mM Hepes. All
these media were supplemented with 10% heat-inactivated fetal
bovine serum and 1 mM sodium pyruvate.

Protein quantification 

Antibodies against macroH2A1 (07-219; Upstate),
macroH2A1.1 and macroH2A1.2 (gift from A. Ladurner), ERα
(sc-543; Santa Cruz), GAPDH (MAB374; Millipore) and H3 (ab1791;
Abcam) were used for immunoblotting. To discriminate between
the two splicing isoforms of macroH2A1, macroH2A1.1 and
macroH2A1.2, total cell extracts were separated on low cross-
linking (12% acrylamide, 1:125 bisacrylamide) SDS-polyacryl-
amide gels and blotted with antibodies specific to one of the two
isoforms (Fig. S1). Proteins were quantified using the
Image Gauge software. Expression levels of each isoform and total
macroH2A1 were normalized to GAPDH. ZR-75 expression was
used as the reference sample.

RNA extraction, reverse transcription and Quantitative 
PCR analysis 

Total RNA was extracted using an RNeasy mini kit (Qiagen).
First strand cDNA was generated using the ThermoScript RT-
PCR system (Invitrogen) and used as the template for quantitative
PCR (qPCR) using the Platinum SYBR Green qPCR SuperMix
(Invitrogen) according to the manufacturer's instructions. Gene- or
splice variant-specific primers are shown in Fig. S2A. Relative
levels of RNA were determined using the threshold cycle (ΔCT)
method [28]. Expression levels were normalized to the ribosomal
RPLP0 gene. ZR-75 expression was used as the sample calibrator.
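
As an illustration of the relative quantification described above, the following minimal Python sketch computes a fold change from threshold cycle values. It assumes the widely used 2^(-ΔΔCt) form of the threshold cycle method (the text cites only the method of [28]), with a reference gene and a calibrator sample playing the roles that the ribosomal reference gene and ZR-75 play here. The function name and all Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change of a target transcript relative to a calibrator sample.

    Assumes the 2^(-ddCt) form of the comparative threshold cycle method.
    ct_target, ct_ref         -- Ct of target and reference gene in the sample
    ct_target_cal, ct_ref_cal -- the same two Ct values in the calibrator
    """
    delta_ct_sample = ct_target - ct_ref              # normalize sample to reference gene
    delta_ct_calibrator = ct_target_cal - ct_ref_cal  # normalize calibrator the same way
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier in the sample than in the calibrator, at equal reference Ct.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_cal=26.0, ct_ref_cal=18.0)
print(fold)
```

By construction the calibrator itself has a relative expression of 1, so every other cell line is read as a multiple of the calibrator level.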

Determination of the macroH2A1.1/macroH2A1 mRNA 
ratio 

A summary of the probe set IDs used is reported in Fig. S2. Among
the probe sets of the HG-U133A array, three detect expression of
the H2AFY gene: 214501_s_at and 207168_s_at, which
are common to the two isoforms, and 214500_at, which specifically
recognizes the sequence of exon 6a of macroH2A1.1 and 10
nucleotides in exons 5 and 7 common to the macroH2A1.1 and
macroH2A1.2 isoforms (Fig. S2). We extracted the corresponding
log2 RMA values from the different GEO datasets studied,
determined the relative expression of macroH2A1 (mean value of
214501_s_at and 207168_s_at) and macroH2A1.1 (214500_at), and
calculated the macroH2A1.1/macroH2A1 mRNA ratio by the
following formula:

$$\log_2(\text{214500\_at}) - \frac{\log_2(\text{214501\_s\_at}) + \log_2(\text{207168\_s\_at})}{2} = \log_2(\text{macroH2A1.1/macroH2A1 ratio})$$
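
The ratio calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input values are taken to be log2 RMA intensities, so the ratio becomes a difference of logs. The function name and the numeric values are hypothetical and are not data from the datasets analyzed here.

```python
def log2_isoform_ratio(specific_probe, common_probes):
    """log2(macroH2A1.1/macroH2A1) from log2-scale probe set values.

    specific_probe -- log2 value of the isoform-specific probe set
    common_probes  -- log2 values of the probe sets common to both isoforms
    """
    mean_common = sum(common_probes) / len(common_probes)  # total macroH2A1 estimate
    return specific_probe - mean_common                    # log of a ratio = difference of logs


# Hypothetical log2 RMA values for a single cell line.
sample = {"214500_at": 8.5, "214501_s_at": 10.0, "207168_s_at": 11.0}
ratio = log2_isoform_ratio(
    sample["214500_at"],
    [sample["214501_s_at"], sample["207168_s_at"]],
)
print(ratio)
```

Because the result stays on the log2 scale, a value of 0 means the isoform-specific signal equals the mean of the common probe sets, and each unit corresponds to a two-fold change in the ratio.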

Among the probe sets of the Illumina Human-6 v1 expression
bead chip, two detect expression of the H2AFY gene:
6620403, which is common to the two isoforms, and
6620403, which specifically recognizes the sequence of exon 6a
of macroH2A1.1. Among the probe sets of the Illumina
HumanHT-12 v4.0 expression bead chip, three detect
expression of the H2AFY gene: ILMN_2373495 and ILMN_1746171,
which are common to the two isoforms, and
ILMN_1674034, which specifically recognizes the sequence of
exon 6a of macroH2A1.1. We applied the same formula as above
with the corresponding log2 RMA values.

Analysis of correlation of macroH2A1.1/macroH2A1 
mRNA ratio and breast cancer cell lines markers 

For each GEO dataset analyzed, we kept the intrinsic
molecular subtype of breast cancer cell lines attributed in the
original study or, where none was given, we assigned the intrinsic
molecular subtype as defined in Table S1. We then classified the different
breast cancer cell lines into two groups, luminal/basal or claudin-
low/non claudin-low, and compared the distribution of
macroH2A1 and macroH2A1.1 expression levels or macroH2A1.1/
macroH2A1 mRNA ratio values (Table S2). The reported p-values
are the results of a two-tailed Mann-Whitney test.
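
The two-group comparison described above can be illustrated with a minimal pure-Python sketch of a two-sided Mann-Whitney U test. It is an approximation under stated assumptions: the p-value uses the normal approximation without tie or continuity correction, so it will not exactly reproduce the values of a statistics package, particularly for small groups. The function name and the example ratios are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Tied values receive average ranks; no tie or continuity correction
    is applied, so the p-value is approximate for small samples.
    Returns (U, p) where U is the smaller of the two U statistics.
    """
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p


# Hypothetical log2 ratios for two groups of cell lines.
claudin_low = [-1.2, -0.9, -0.8, -0.5]
non_claudin_low = [-2.4, -2.1, -1.9, -1.8, -1.6]
u_stat, p_value = mann_whitney_u(claudin_low, non_claudin_low)
print(u_stat, p_value)
```

A rank-based test is a natural choice here because the groups are small and unequal in size, and no assumption of normally distributed ratios is needed.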

Survival analyses 

A summary of the Affymetrix microarray datasets used in this study,
including the number of patients included in each stage of the
analysis, is given in Table 1.

Follow-up data were available for 383 of the 579 TNBC samples
from the GSE31519 dataset. All survival intervals were measured
from the time of surgery to the survival endpoint used in
the individual datasets. In the present analysis,
event free survival (EFS) was taken preferentially as the
RFS endpoint, and as the
DMFS endpoint if RFS was not available. Rody et al. [29]
have previously shown that the effect of using these different
endpoints was rather small in the overall dataset. Follow-up data
for those women in whom the envisaged endpoint was not
reached were censored as of the last follow-up date or at 120
months. Subjects with missing values were excluded from the
analyses. For the analyses of untreated and adjuvant therapy
treatment groups, we applied the Kolmogorov-Smirnov test to
compare the cumulative distributions of the two data sets.

To determine the cutoff value of the macroH2A1.1/macroH2A1
mRNA ratio, a receiver-operating characteristics (ROC) analysis
was performed. We constructed Kaplan-Meier curves and used
the log-rank test to determine the univariate significance of the
variables. The predictive potential of the macroH2A1.1/macroH2A1
mRNA ratio was assessed by its positive and negative predictive
values (PPV and NPV) (Table S3). A Cox proportional-hazards
model was used to examine the effects of multiple covariates on
survival (Table S3). All p-values are two-sided, and a threshold of
0.05 was used for significance.
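
To make the survival estimate concrete, the following Python sketch implements the Kaplan-Meier product-limit estimator together with administrative censoring at a fixed follow-up limit. It is a minimal illustration, not the software used in this study: the function names and the follow-up data are hypothetical, and the log-rank test and ROC cutoff selection are not shown.

```python
def censor_at(times, events, limit=120.0):
    """Administratively censor follow-up at `limit` months.

    Follow-up longer than the limit is truncated to the limit and
    marked as censored (event = 0).
    """
    new_times = [min(t, limit) for t in times]
    new_events = [e if t <= limit else 0 for t, e in zip(times, events)]
    return new_times, new_events


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival_probability)].

    times  -- follow-up time per patient (months)
    events -- 1 if the event occurred at that time, 0 if censored
    One point is returned per distinct time at which an event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # gather all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up for one group (months, event indicator).
months = [10, 20, 30, 40]
status = [1, 0, 1, 1]
curve = kaplan_meier(*censor_at(months, status))
print(curve)
```

In practice one such curve would be computed for each of the two groups defined by the ratio cutoff, and the curves would then be compared with the log-rank test.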

All statistical analyses were performed using XLSTAT
version 2013.1.

Accession codes 

Summary of public databases used: Gene Expression Omnibus 
(NCBI, Bethesda, MD, USA): GSE16795, GSE9691, GSE24202, 
GSE31519; ArrayExpress (EBI, Hinxton, UK): E-TABM-157, E- 
MTAB-183, E-MTAB-827, E-MTAB-884. 

Results 

Expression of MacroH2A1 splice variants in breast cancer
cell lines

We quantified protein expression levels of macroH2A1,
macroH2A1.1 and macroH2A1.2 per se, and of each macroH2A1
isoform relative to the total macroH2A1 protein pool, in BC cell
lines. MacroH2A1 isoform or macroH2A1 expression levels did
not differ significantly between cell lines representative of the ER
positive luminal subtype, ZR-75 and MCF-7 cells, and cell lines of
the ER negative basal subtype, MDA-MB436, Hs578T, and
MDA-MB231 (Fig. 1A). In contrast, expression levels of
macroH2A1.1 protein relative to the total macroH2A1 protein pool
were greater in cells of the basal subtype than in the luminal
one (Fig. 1B). Unlike macroH2A1.2, which showed no significant
variation in expression levels, the increase in macroH2A1.1 was
greatest in MDA-MB231 cells.

Using splice variant-specific primers (Fig. S2), we also determined
that macroH2A1.1 transcription was greater in the basal
than in the luminal subgroup of BC cell lines, while mRNA
expression levels of macroH2A1.2 or total macroH2A1 mRNA
did not differ significantly (Fig. 1C). As at the protein level, this
differential expression was consolidated by analysis of the
proportional expression of macroH2A1.1 relative to total
macroH2A1 (Fig. 1D).

To test whether this observation could be extended to a larger
panel of BC cell lines, we analyzed macroH2A1 expression in 51
BC cell lines from data published by Neve et al. [30] (Fig. 1E).
Total macroH2A1 expression levels were reduced in the basal BC
cell lines (p = 0.006). Therefore, even if macroH2A1.1 expression
levels per se did not vary (p = 0.555), the relative proportion of
macroH2A1.1 to global macroH2A1 was significantly higher in
the basal subtype (p = 0.027).

Because the three basal BC cell lines tested (Fig. 1A-D) belonged
to the claudin-low subtype, we subdivided the cell lines from the Neve
et al. study into two groups: claudin-low and non claudin-low BC
cell lines. Cell line subtypes were attributed as in Prat et al. [31]
(Table S1). We then compared the relative expression levels of
macroH2A1 mRNAs. Increased macroH2A1.1 expression levels
appeared typical of claudin-low subtype BC cell lines (Fig. 2A).
This correlation was significant for the macroH2A1.1/
macroH2A1 mRNA ratio (compare p-values in Fig. 2A, center and
right panels) and was further confirmed by our analysis of several
independent studies that differed in the nature of the cell lines
(Fig. S3) and the array platform used (Fig. 2B).

Finally, the study of Lapuk et al. [32] allowed us to assess
alternative splicing in 31 BC cell lines. We first analyzed relative
expression levels of each exon of the H2AFY gene except that of exon
8 (data unavailable). Globally, each exon was less expressed in the
claudin-low than in the non claudin-low BC cell lines (Fig. 3A).
One exception appeared to be exon 6a of macroH2A1.1, which
was expressed more strongly in claudin-low BC cell lines, but the
difference was not statistically significant (Fig. 3B). We normalized
the level of expression of each exon (log2 RMA values) relative to
that of exon 9, the most expressed exon of the H2AFY gene. As
shown in Fig. 3C, expression levels of most of the exons of the
H2AFY gene decreased in claudin-low subtype cells; this was
also true for the macroH2A1.2 specific exon 6b (p = 0.001)
(Fig. 3D). In contrast, expression levels of the macroH2A1.1 specific
exon 6a increased in the claudin-low compared to the non
claudin-low subtype (p = 0.007). Moreover, determination of the
splicing index showed that only exon 6a varied in all cell lines
tested (Fig. 3E).

We conclude that the expression of macroH2A1.1 relative to
total macroH2A1 expression (the macroH2A1.1/macroH2A1 mRNA
ratio defined in Materials and Methods), not macroH2A1.1
expression per se, is correlated specifically with the claudin-low
molecular subtype.

MacroH2A1.1 variant expression correlates with 
epithelial-mesenchymal transition 

We classified the macroH2A1.1/macroH2A1 mRNA ratios of a
set of 38 BC cell lines relative to the E-cadherin expression data
determined by Hollestelle et al. [33]. In E-cadherin-negative cell lines
this ratio was generally greater and more diverse, but the
difference between E-cadherin-positive and E-cadherin-negative cells
was not statistically significant (Fig. 4A). Different mechanisms for
inactivating E-cadherin have been identified in human cancers:
inherited and somatic mutations, increased promoter methylation,
and induction of transcriptional repressors of the Twist, Snail and
Zeb families [6]. The latter induce EMT in parallel with
induction of mesenchymal markers such as N-cadherin and/or
vimentin. Interestingly, most of the cell lines exhibiting high
macroH2A1.1/macroH2A1 mRNA ratios expressed N-cadherin
and vimentin (Fig. 4B).

To determine whether enrichment of macroH2A1.1
could be related to the EMT process, we analyzed expression
levels of this variant in different cellular models of EMT.
Comparison of macroH2A1.1/macroH2A1 mRNA ratios in
HMLE_shGFP and HMLE_shEcad revealed that reduction of
E-cadherin expression levels was accompanied by an increase in
the macroH2A1.1 mRNA ratio in two independent data sets
(Fig. 4C) [34,35]. This increase was clearly associated with
induction of EMT due to dysfunction in intracellular signaling
caused by reduced E-cadherin levels. Indeed, expression of a
truncated form of E-cadherin (ΔN-Ecad) lacking the extracellular
domain of the wild-type protein, which is normally responsible for
E-cadherin cell-cell adhesion, was not correlated with an increase in
macroH2A1.1 mRNA ratios (Fig. 4C). Accordingly, overexpression
of the EMT inducers Twist1, Goosecoid or Snail in the HMLE cell
line was accompanied by an increase in macroH2A1.1 mRNA
(Fig. 4D). In contrast, macroH2A1.1 mRNA ratios were not up-
regulated by overexpression of TGF-β. This is in agreement with
previous observations showing that induction of an EMT by Snail
or Twist does not depend on TGF-β autocrine signaling [6].
Moreover, TGF-β signaling is not sufficient for an EMT
conversion in primary normal, immortalized, and neoplastic
HMECs [36], and is thus insufficient to induce an increase in
the macroH2A1.1/macroH2A1 mRNA ratio.

Extensive changes in alternative splicing play a role in shaping
the cellular behavior patterns that characterize EMT. Interestingly,
the macroH2A1.2 specific exon was shown to be an Epithelial
Splicing Regulatory Protein (ESRP)-regulated cassette [4]. Analysis
of the genomic context of the exon specifically included in
macroH2A1.1 identified potential binding sites for the EMT-
associated splicing factors ESRP1 and RBFOX2 (Fig. 4E). ESRP
binding sites located at the 5' end and within the regulated exon
Figure 1. Expression levels of macroH2A1 splice variants in breast cancer cell lines. A-B- MacroH2A1.1, macroH2A1.2, and total
macroH2A1 protein expression levels in five breast cancer cell lines. Quantification of each macroH2A1 splice variant, global macroH2A1 (A) and
macroH2A1.1/macroH2A1, macroH2A1.2/macroH2A1 ratios (B) for each cell line normalized to GAPDH is shown relative to the ZR-75 cell line. C-D-
qPCR analysis of mRNA expression levels of macroH2A1, macroH2A1.1, macroH2A1.2 and macroH2A1 splice variants/macroH2A1 mRNA ratio. Each
quantification was performed at minimum in biological triplicate. Expression levels were normalized to expression of RPLP0, and referred to the cell
line ZR-75 as a sample calibrator. E- Analysis of expression data of macroH2A1 variants in 51 breast cancer cell lines on the basis of U133A array
hybridization [30]. Log2 macroH2A1, macroH2A1.1 and macroH2A1.1/macroH2A1 values are determined and classified according to luminal or basal
molecular subtype as defined in [30]. Data of DU4475, HCC1008 and HCC1599 are included in the analysis with the molecular subtype assigned in the
synthesis part of Table S1. The median of each subgroup is shown (grey bar). The reported p-values are the results of a two-tailed Mann-Whitney test.
doi:10.1371/journal.pone.0098930.g001



seem to be ESRP Binding Splicing Inhibitors (EBSI), and a
RBFOX2 binding site found downstream of the alternatively
spliced exon seems to be an RBFOX2 Binding Splicing Enhancer
(RBSE). Exon 6a skipping could thus result from the interaction of
ESRP1 with EBSI in an epithelial context. In a mesenchymal
context, exon 6a would be preferentially included by enhanced
binding of RBFOX2 (Fig. 4E).
Figure 2. High macroH2A1.1 expression level in breast cancer cell lines characterizes Claudin-low molecular subtype. In two
independent analyses, macroH2A1, macroH2A1.1 and macroH2A1.1/macroH2A1 mRNA ratios were determined for each cell line and classified
according to claudin-low or non claudin-low molecular subtype assigned in the synthesis part of Table S1. The median of each subgroup is
specified. In the E-TABM-827 analysis, GI-101, HB4A, PMC42 and VP229 cell lines were omitted as the subgroup Basal A or B was not specified; HCC1509,
MT3 and VP267 cell lines were omitted as their subtype was not assigned. The reported p-values are the result of a two-tailed Mann-Whitney test.
doi:10.1371/journal.pone.0098930.g002



Prognostic significance of macroH2A1.1 mRNA ratio in 
TNBC 

To assess the potential prognostic value of the macroH2A1.1
mRNA ratio in breast cancer, we analyzed the event-free survival
of patients as a function of the macroH2A1.1 mRNA ratio reported
for the GSE31519 dataset, which provides access to a large cohort
of TNBCs (Table 1).

We plotted Kaplan-Meier survival curves based on
macroH2A1.1 mRNA ratios segmented into two groups, high
and low macroH2A1.1 mRNA ratios. Receiver Operating
Characteristic (ROC) analysis was used to find the optimal cut-off
level. We used the log-rank test to determine the univariate
significance of the variables. Poor prognosis TNBCs had high
macroH2A1.1 mRNA ratios (p = 0.001) (Fig. 5A). The positive
predictive value (PPV) and negative predictive value (NPV) were
49% and 69%, respectively. In multivariate Cox regression
analysis, including age, histological grade, tumor size and lymph
node status, the macroH2A1.1/macroH2A1 mRNA ratio showed
a trend as an independent predictor (HR 3.457, 95% CI 1.087 to
10.990, p = 0.036) (Fig. 5B).
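
For readers less familiar with predictive values, the sketch below shows how PPV and NPV follow from a two-by-two table once patients are split by the ratio cutoff. It assumes that a "positive" test is a high ratio and that the outcome is the occurrence of an event during follow-up. The function name and the counts are hypothetical and do not correspond to the cohort analyzed here.

```python
def predictive_values(high_with_event, high_without_event,
                      low_with_event, low_without_event):
    """Return (PPV, NPV) for a dichotomized marker.

    PPV -- fraction of high-ratio patients who had an event
    NPV -- fraction of low-ratio patients who remained event-free
    """
    ppv = high_with_event / (high_with_event + high_without_event)
    npv = low_without_event / (low_with_event + low_without_event)
    return ppv, npv


# Hypothetical counts, for illustration only.
ppv, npv = predictive_values(high_with_event=30, high_without_event=20,
                             low_with_event=40, low_without_event=60)
print(ppv, npv)
```

Both values depend on how frequent events are in the cohort, so they are not directly comparable between cohorts with different event rates.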

Among the clinico-pathological characteristics of TNBC
patients included in the GSE31519 dataset, the use of systemic
treatment was specified. Hence, we sub-divided the cohort into
one untreated sub-cohort and a second, treated sub-cohort that
groups all patients who received adjuvant treatment. Analysis of the
macroH2A1.1/macroH2A1 mRNA values in the untreated sub-
cohort revealed a distribution comparable to that of the total
cohort (p = 0.235) (Fig. S4). MacroH2A1.1/macroH2A1 mRNA
values of the group treated with an adjuvant therapy differed from
those of the total cohort, with a shift towards higher values
(p = 0.008) (Fig. S4). Kaplan-Meier survival curves were
plotted according to macroH2A1.1 mRNA ratios after segmentation
into high and low groups, as above, for the two sub-cohorts.
We observed that, in contrast to treated tumors (p = 0.152), high
macroH2A1.1 mRNA ratios still correlated with reduced survival
for untreated tumors (p = 0.001) (Fig. 6).

Discussion 

We provide evidence that overexpression of macroH2A1.1
correlates with major mesenchymal markers of the claudin-low
breast cancer subtype. Notably, the increase in macroH2A1.1
seems to be a residual trace of an EMT process, correlated with
poor prognosis in TNBCs.

Claudin-low tumors are typically TNBCs with poor long-term
prognosis, despite reduced expression of genes related to cell
proliferation. Unlike prognostic signatures that rely
heavily on proliferation-related genes, macroH2A1.1 is preferentially
associated with non-proliferative phenomena. It would thus belong to a
new class of prognostic markers independent of proliferative status,
similar to factors related to the immune system response [37].



PLOS ONE | www.plosone.org | June 2014 | Volume 9 | Issue 6 | e98930
macroH2A1.1, an Epigenetic Mark of Poor Prognosis



[Figure 3 image not reproduced. Panel annotations recoverable from the extraction (data set E-MTAB-183): B, exon 6b of macroH2A1.2 p<0.001 and exon 6a of macroH2A1.1 p=0.216; D, exon 6b of macroH2A1.2 p=0.001 and exon 6a of macroH2A1.1 p=0.007; E, p=0.0004.]

Figure 3. Analysis of expression data of macroH2A1 variants in breast cancer cell lines on the basis of Affymetrix Human Junction
technology (E-MTAB-183 [11]). A- Log2 expression values of exons of the H2AFY gene are represented by molecular subtype, non claudin-low (non CL)
and claudin-low (CL). The analysis for exons 6a/b is highlighted in B. C- Log2 expression values of exons of the H2AFY gene normalized to that of exon
9 are represented by molecular subtype, non claudin-low (non CL) and claudin-low (CL). The analysis for exons 6a/b is highlighted in D. E- Splicing
index values for exon 6a, included in the macroH2A1.1 splice variant, are represented by molecular subtype. The median of each subgroup is shown (grey
bar). The reported p-values are the results of a two-tailed Mann-Whitney test.
doi:10.1371/journal.pone.0098930.g003



Interestingly, it was shown that upon entering EMT, HMECs
develop a stable, low proliferative mesenchymal phenotype.
MacroH2A1 was identified as an epigenetic barrier which
participates in the maintenance of cell identity and antagonizes
induction of cell reprogramming to naive pluripotency [38,39].
Thus, macroH2A1.1 could be involved in the maintenance of a
mesenchymal state, partial or complete, by establishing an
epigenetic barrier against further de/differentiation.

The difficulty of identifying EMT-transitioning cells in vivo
creates skepticism regarding the pathological relevance of EMT.
One explanation for this is that cancer cells only undergo a
transient EMT, reverting back to the epithelial state by a
mesenchymal-epithelial transition (MET), making it difficult to
isolate cells with true EMT markers. Studies in experimental
mouse models have shown that a complete EMT-MET cascade is
important for tumor metastasis [40,41]. If the EMT process is so
transient and, in parallel, so important for the development of
metastatic tumors, why do only claudin-low and, to a lesser extent,
metaplastic intrinsic molecular subtypes of BC present molecular
features of EMT? One explanation could be that in claudin-low
tumors the EMT-MET turnover is trapped in an intermediate
mesenchymal state, in which EMT markers are present. We
Figure 4. Overrepresentation of macroH2A1.1 correlated with mesenchymal features and induction of EMT features in HMLE cells
is accompanied by an increase in macroH2A1.1/macroH2A1 mRNA ratio. A-B Analysis of GSE16795 data set [33]. MacroH2A1.1/macroH2A1
mRNA ratios are determined for each cell line and classified depending on the level of expression of E-cadherin (A) or on the level of expression of
Vimentin or N-cadherin (B). The median of macroH2A1.1/macroH2A1 values of each subgroup is specified. The reported p-values are the result of a
two-tailed Mann-Whitney test. C- macroH2A1.1/macroH2A1 mRNA ratios were determined in immortalized human mammary epithelial cells in which
E-cadherin function was inhibited either by shRNA (GSE9691 [34] and E-MTAB-884 [45]) or by expression of a truncated form of E-cadherin (ΔN-
Ecad) (GSE9691 [34]) and compared. D- MacroH2A1.1/macroH2A1 mRNA ratios were determined for different breast cancer stem cell-like lines which
overexpressed one EMT inducer, i.e. TGFβ, Twist, Gsc or Snail, and compared (GSE24202 [6]). The median of each subgroup is shown (grey bar). The
reported p-values are the results of a two-tailed Mann-Whitney test. E- Upper panel- genomic sequence of the H2AFY gene encompassing exon 6a.
Potential ESRP1 and RBFox2 binding sites are represented with grey background and white letters. Bottom panel- Hypothetical schema for alternative
splicing of exon 6a included in the macroH2A1.1 splice variant. Two cellular backgrounds are represented, epithelial with exon skipping of exon 6a
related to the inhibitory binding of ESRP1 to EBSI, and mesenchymal with exon inclusion of exon 6a potentiated by binding of RBFox2 to RBSE.
doi:10.1371/journal.pone.0098930.g004



speculate that macroH2A1.1 stabilizes chromatin organizations
characteristic of transcriptional programs linked to paused cell
cycle progression. Hence, macroH2A1.1 expression could divert
EMT-MET processes, stop progression and trap cells in such an
intermediate state.

High macroH2A1.1 mRNA ratios in the slow cycling claudin-
low molecular subtype [31] are consistent with earlier observations
that macroH2A1.1 expression may be restricted to non-
proliferative tissues [42], and that loss of its expression in lung and
colon cancer was related to enhanced proliferation of cancer
cells [24-26]. In the 67NR mouse model, which forms primary
carcinomas when implanted into mouse mammary fat pads,
Dardenne et al. identified a high macroH2A1.1/macroH2A1
ratio. Conversely, in the 4T1 mouse model, reduced macroH2A1.1
expression was correlated with macroscopic metastatic capacity in
the lung [43]. Our results point to high macroH2A1.1/
macroH2A1 ratios as markers of engaged but paused intermediate
cellular stages of the EMT. Because the metastatic power of a
tumor clearly depends on a complete EMT-MET process, it is
tempting to propose a model in which macroH2A1.1 is linked to
the EMT process and macroH2A1.2 to the MET process.
Multivariate analysis of event-free survival according to standard parameters
Age: mean 51.59
Tumor grade
*Information on all parameters was available for 261 of the 383 TNBC samples with follow-up data from the GSE31519 data set.



Figure 5. Kaplan Meier analysis according to the macroH2A1.1 mRNA ratio. A- The 383 TNBC samples from the GSE31519 cohort were
stratified according to the macroH2A1.1/macroH2A1 mRNA ratio. Kaplan Meier analysis of event free survival of 383 samples with follow up
information is shown. Positive and negative predictive values (PPV and NPV) of macroH2A1.1/macroH2A1 mRNA ratio in the cohort are specified. B-
Multivariate Cox proportional hazards models of disease-free survival.
doi:10.1371/journal.pone.0098930.g005
Figure 6. Kaplan Meier analysis according to the macroH2A1.1 mRNA ratio in untreated and treated sub-cohort. A- The 259 TNBC
untreated samples from the GSE31519 cohort were stratified according to the macroH2A1.1/macroH2A1 mRNA ratio. Kaplan Meier analysis of event
free survival of 259 samples with follow up information is shown. PPV and NPV of macroH2A1.1/macroH2A1 mRNA ratio in the untreated cohort are
specified. B- The 87 TNBC adjuvant chemotherapy treated samples from the GSE31519 cohort were stratified according to the highest macroH2A1.1/
macroH2A1 mRNA ratio. Kaplan Meier analysis of event free survival of 87 samples with follow up information is shown. PPV and NPV of
macroH2A1.1/macroH2A1 mRNA ratio in the treated cohort are specified.
doi:10.1371/journal.pone.0098930.g006



TNBC is generally associated with a poor outcome, which is
essentially not predicted by assessment of standard clinico-
pathological variables, such as lymph node status or tumour size
at initial presentation. The lack of identified molecular targets in
the majority of TNBCs implies that chemotherapy remains the
treatment of choice for patients with TNBCs. Here we show that,
regardless of the reason that led to an absence of adjuvant therapy
for patients involved in the GSE31519 study, those with a high
macroH2A1.1/macroH2A1 mRNA ratio have a worse prognosis
than those with a low one. Even if this observation clearly needs to
be confirmed in a larger cohort, it is tempting to propose that
assessing macroH2A1.1 expression levels will allow the identification
of TNBC patients who, despite favorable clinico-pathological
variables such as lymph node status or tumour size at initial
presentation, will have a worse prognosis and may benefit from
treatment. Interestingly, EMT and cell dissemination, although
long associated with advanced stages of tumor progression, can be
found at pre-neoplastic developmental stages of tumors [44].
Identifying an early EMT process in primary tumors could then allow
detection of tumors progressing towards metastasis. As expression
of macroH2A1.1 seems to be correlated with EMT and
unfavorable behavior in untreated TNBC patients, it is tempting
to suggest macroH2A1.1 expression levels as an early biomarker of
tumorigenesis.

No difference in survival of patients who underwent adjuvant 
treatment was seen with respect to the macroH2A1.1/macroH2A1 
mRNA ratio. However, as the ratios in this sub-cohort are 
already globally higher than in untreated tumors (Fig. S4), one 
could speculate that the clinico-pathological parameters that 
initially led to treatment may already correlate with a higher 
macroH2A1.1/macroH2A1 mRNA ratio. It will be interesting to 
analyse this in more depth in a larger cohort. 

In conclusion, it will be interesting to test whether the 
correlation between macroH2A1.1 expression levels and EMT 
markers or poor prognosis in a TNBC cohort could be linked to a 
role for macroH2A1.1 in the maintenance of a transient cellular 
state in the early EMT process towards metastatic development of 
breast tumors. 

Supporting Information 

Figure S1 A- Characterization of α-macroH2A1 antibodies and 
cell lines. The specificity of α-macroH2A1 antibodies was verified 
using low cross-linking SDS-polyacrylamide gels (12.5% acrylamide, 
1:125 bisacrylamide) to separate the two splice variants 
macroH2A1.1 and macroH2A1.2. Total extracts of breast cancer 
cell lines were first resolved in SDS-polyacrylamide gels (standard 
(left) or low cross-linking (right panel)) then immunoblotted with 
α-macroH2A1 and α-ERα antibodies. Left panel: ERα, macroH2A1 
and H3 specific antibodies. Right: top panel: macroH2A1 specific; 
middle panel: macroH2A1.1 specific; bottom panel: 
macroH2A1.2 specific antibody. B- MacroH2A1.1, 
macroH2A1.2, and total macroH2A1 protein expression levels 
in five breast cancer cell lines. Total protein extracts were 
immunoblotted with α-macroH2A1 (bottom panel), 
α-macroH2A1.1 (top panel) or α-macroH2A1.2 antibodies (middle 
panel). 
(TIF) 

Figure S2 Primers, qPCR parameters and probe set IDs 
summary. A- Sequences, genomic location and size of amplicons 
generated by primers used in qPCR reactions are summarized. 
Parameters of the standard curve are reported for each pair of 
primers in each cell line used. For a given amplicon, efficiencies in 
the different cell lines are reported and compared with each other 
(STDEV(Es)). B- Characterization of probe set IDs of the Affymetrix 
U133A, Illumina Human-6 v1 expression beadchip, and Illumina 
HumanHT-12 v3.0 expression beadchip arrays corresponding to 
macroH2A1 variants. For each probe, nucleotide reference 
sequences and macroH2A1 isoforms recognized are reported, as well as 
the nucleotide and genomic localization of the sets of oligonucleotides 
present in the probe ID. C- Clustal W multiple alignment of 

Figure S3 High macroH2A1.1 expression level in breast 
cancer cell lines characterizes Claudin-low molecular 
subtype. MacroH2A1.1/macroH2A1 mRNA ratios were determined 
for each cell line and classified according to the molecular 
subtype assigned in the synthesis part of Table S1. In the GSE16795 
analysis [33], data from the H3396 cell line are omitted as its subtype 
was not assigned. The median of the macroH2A1.1/macroH2A1 
values of each subgroup is specified. The reported p-values are 
the result of a two-tailed Mann-Whitney test. 
(TIF) 
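
As an illustration of the two-tailed Mann-Whitney test named in the legend of Figure S3, the sketch below compares two groups of ratios using only the Python standard library. It relies on the normal approximation with a tie correction and no continuity correction, which is an assumption of this sketch; statistical packages typically use an exact test for small groups, so p-values may differ from those reported. The function name and the example values are hypothetical and are not data from this study.

```python
import math

def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p) where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed p-value
    return u, p

# Hypothetical ratios for two molecular subtypes (illustrative values only).
group_a = [0.9, 0.8, 0.7]
group_b = [0.1, 0.2, 0.3]
u_stat, p_value = mann_whitney_u(group_a, group_b)
```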

Figure S4 Analysis of the distribution of macroH2A1.1/ 
macroH2A1 mRNA values in the different groups of 
patients studied. 
(TIF) 